Introduction to RCSI and Improved Entity Framework Locking Hint Interceptor
A few days ago, I accidentally discovered that Google's NotebookLM had actually supported Traditional Chinese in its Podcasts since April 30, 2025 (ironically, until January 2026, Gemini kept insisting to me that it didn't). On a whim, I fed some of my technical notes into it, thinking I could convert them into a podcast to listen to during my commute.
Since Antigravity suggested that my notes were full of decision-making processes, I set the style to "Debate." After all, I don't consider my notes to be absolutely correct, so I also prompted it to help me check for any errors or outdated content.
The resulting content left me a bit dumbfounded; it felt like much of the criticism was for the sake of criticism:
- Debating for the sake of debating: In my notes, I mentioned that I am accustomed to using
DateTimeOffset, and I recalled a pitfall a colleague encountered when usingDateTimewith UTC in Entity Framework (I identified the cause and provided a solution). The conclusion of the review was: I should have just told my colleague to switch toDateTimeOffsetinstead of wasting time struggling withDateTime? - Missing the point: The two AI hosts spent a large portion of the time attacking my "Introduction" and "Background" while ignoring the technical details of the main content.
- Calling a deer a horse: I was merely demonstrating syntax or discussing certain mechanisms, and I explicitly wrote in the text that I did not recommend using them or noted potential issues. Yet, it claimed I was "promoting" these practices?
Well, I admit I'm a bit sensitive, so I ended up deleting all those podcasts.
However, I can't say I gained nothing; I heard a key term: RCSI (Read Committed Snapshot Isolation).
I remember that around 2018, when interviewing for companies with high-traffic or high-concurrency systems, knowing how to use WITH (NOLOCK) in SQL Server was almost a mandatory question. It showed you knew how to prevent queries from being blocked by writes. But after checking recent materials, I found that this practice is now considered an Anti-Pattern in some scenarios.
Since I've been busy and a bit lazy lately, I'll just jot down the key points here:
What is RCSI?
The full name is Read Committed Snapshot Isolation. Simply put, it solves locking issues through the concept of "version control." When someone is writing data, read operations are not blocked, nor do they read "dirty" data that is in the middle of being written. Instead, they read the "last committed version (Snapshot) before the write."
Is RCSI a global switch?
Yes, RCSI (READ_COMMITTED_SNAPSHOT) is a Database Level setting.
Once enabled (ALTER DATABASE ... SET READ_COMMITTED_SNAPSHOT ON), it changes the behavior of all queries using the Read Committed isolation level (the default) throughout the entire database.
- It cannot be enabled for "specific tables."
- It is globally effective; all queries that do not explicitly specify an isolation level will automatically become "Snapshot Read" mode.
TIP
If you want to use snapshot isolation "selectively," you must use Snapshot Isolation and explicitly specify IsolationLevel.Snapshot when creating a transaction in EF. However, this level of fine-grained control places high demands on the development team. For projects with frequent transactions, honestly, even I am not confident that I could spend the time to clarify and make the correct isolation level judgments for every single case.
WARNING
Please note that RCSI primarily solves locking issues related to queries (avoiding read blocking by writes). If you need to handle concurrency conflicts during data updates (e.g., two people modifying the same record simultaneously), you must still use the RowVersion (Optimistic Locking) mechanism to ensure data consistency. RCSI cannot replace RowVersion.
RCSI vs. NOLOCK vs. READPAST
WITH (NOLOCK): To avoid being locked, I am willing to read "Dirty Data (Dirty Read)."
WITH (READPAST): If a row is locked, just "skip" it. Suitable for queue processing, but not for general reports (because data will be missing).
RCSI: Read consistent "Snapshot Data." Achieves non-blocking reads without adding hints.
TIP
Whether it's a dirty read or a snapshot read, in high-concurrency scenarios, there is no guarantee of getting the "latest state" of the data, as the data is constantly changing. The difference is: a snapshot read only reads the last committed historical version, while a dirty read might read an incomplete intermediate state, which destroys atomicity and thus carries a significantly higher level of risk.
Why wasn't it used before, and why is it mainstream now?
The cost of RCSI is that it uses TempDB heavily to store data versions. In the era of traditional mechanical hard drives (HDD), this would cause severe I/O bottlenecks. But the mainstream environment has changed:
- SSD Popularity: I/O speed has increased significantly, and the burden on TempDB is no longer a fatal flaw.
- Cloud Defaults: For example, Azure SQL Database has RCSI enabled by default.
Therefore, if the hardware can handle it (space and I/O), enabling RCSI directly is a cleaner solution than adding NOLOCK everywhere (especially since using NOLOCK in Entity Framework usually requires an interceptor, which is cumbersome).
TagWith + Interceptor Improved Implementation
Of course, RCSI is the best solution nowadays, but not all projects are suitable for it, and we still need to control locking levels.
I previously wrote a note on How to add WITH (NOLOCK) and Parameter Sniffing handling in Entity Framework. The solution at the time was to use DbCommandInterceptor combined with Regex to force NOLOCK on all SELECT statements.
To be honest, I was never satisfied with that solution. When running list reports, data accuracy might not need to be that high; but when you are retrieving data for updates, reading dirty data is very dangerous. At the time, I could only use "exclude TOP 2" as a workaround, but if it were batch data processing (which requires fetching list data), it would still cause issues.
The Discovery of TagWith
Back in 2024, a colleague found the usage of TagWith while dealing with Parameter Sniffing issues. I mistakenly thought it was just adding a static comment and then using an interceptor to perform string replacement. I recently realized that TagWith is a built-in feature of EF Core that allows the comment to be fully carried into the generated SQL, which gives us a perfect "switch."
The Improved Solution
In this example, I choose not to add WITH (NOLOCK) and OPTION by default, but rather to use TagWith for explicit declaration. The interceptor only intervenes when I explicitly state that this query requires NOLOCK.
1. Define Extension Methods
Make the code look like a Fluent API for clearer semantics.
public static class EfHintExtensions {
public const string TagNoLock = "SQL_HINT: NOLOCK";
public const string TagReadPast = "SQL_HINT: READPAST";
public const string TagOptionUnknown = "SQL_OPTION: OPTIMIZE FOR UNKNOWN";
public const string TagRecompile = "SQL_OPTION: RECOMPILE";
public static IQueryable<T> WithNoLock<T>(this IQueryable<T> query) => query.TagWith(TagNoLock);
public static IQueryable<T> WithReadPast<T>(this IQueryable<T> query) => query.TagWith(TagReadPast);
public static IQueryable<T> WithOptionUnknown<T>(this IQueryable<T> query) => query.TagWith(TagOptionUnknown);
public static IQueryable<T> WithRecompile<T>(this IQueryable<T> query) => query.TagWith(TagRecompile);
}2. Implement the Interceptor
This version of the interceptor has two improvements:
- Schema Support: Improved the Regex to support formats with schemas like
[dbo].[Table](the old Regex would fail). - Option Merging: Automatically handles the merging of multiple options to avoid SQL syntax errors.
public class SqlTaggingInterceptor : DbCommandInterceptor {
private static readonly RegexOptions regexOptions = RegexOptions.Multiline | RegexOptions.IgnoreCase;
// Improved Regex: Supports Schema (e.g., [dbo].[Table])
// Theoretically, table names shouldn't have spaces, but it's technically allowed, so we handle it here
private static readonly Regex tableAliasRegex = new(
@"(?<tableAlias>(?:FROM|JOIN)\s+(?:\[[^\]]+\]\.)?\[[^\]]+\]\s+AS\s+\[[^\]]+\])(?!\s+WITH\s*\()",
regexOptions
);
// Intercept Reader and Scalar (Count/Any must also be handled!)
public override InterceptionResult<DbDataReader> ReaderExecuting(
DbCommand command, CommandEventData eventData, InterceptionResult<DbDataReader> result
) {
FixCommand(command);
return base.ReaderExecuting(command, eventData, result);
}
// ... (Async and Scalar methods omitted, remember to call FixCommand in all of them)
private static void FixCommand(DbCommand command) {
if (string.IsNullOrWhiteSpace(command.CommandText)) {
return;
}
string text = command.CommandText;
bool isChanged = false;
// 1. Handle Table Hints (NOLOCK / READPAST)
string hintToApply = null;
if (text.Contains(EfHintExtensions.TagNoLock)) {
hintToApply = "WITH (NOLOCK)";
}
else if (text.Contains(EfHintExtensions.TagReadPast)) {
hintToApply = "WITH (READPAST)";
}
if (hintToApply != null) {
text = tableAliasRegex.Replace(text, $"${{tableAlias}} {hintToApply}");
isChanged = true;
}
// 2. Handle Query Options
List<string> options = new ();
if (text.Contains(EfHintExtensions.TagOptionUnknown)) {
options.Add("OPTIMIZE FOR UNKNOWN");
}
if (text.Contains(EfHintExtensions.TagRecompile)) {
options.Add("RECOMPILE");
}
if (options.Count > 0) {
text = text.TrimEnd().TrimEnd(';');
text += $" OPTION ({string.Join(", ", options)});";
isChanged = true;
}
if (isChanged) {
command.CommandText = text;
}
}
}Actual Usage
Now, you can control the behavior of each query without worrying about breaking other SQL statements:
// Report: Allow dirty read + solve parameter sniffing
List<Order> orders = context.Orders
.WithNoLock()
.WithOptionUnknown()
.ToList();
// Queue: Skip locked rows
Job job = context.Jobs
.WithReadPast()
.FirstOrDefault();Conclusion
Whether it's WITH (NOLOCK) or Option for handling parameter sniffing, there are now more suitable mainstream approaches handled at the database level, such as enabling RCSI or using Query Store. If hardware and operations can keep up (I/O performance and TempDB space monitoring), especially if you have a DBA, I recommend using them. In most read-heavy, write-light, query-oriented systems, RCSI is a reasonable default. It fundamentally solves read-write conflicts without needing to add hints everywhere in the code. And if the goal is to solve Parameter Sniffing, the Query Store introduced since SQL Server 2016 also provides better execution plan management and enforcement mechanisms.
Of course, all of this assumes that hardware resources (TempDB space and I/O) can keep up, and it is best to have a DBA assist with evaluation and monitoring. But in modern cloud environments or with SSD equipment, these costs are usually worth the investment.
Extended Discussion: Reflections on Global Interception and Invisible Mechanisms
In the process of discussing RCSI and TagWith, one controversy is unavoidable: Should you use a framework to perform global processing?
Regarding the AI host's opinion that "you shouldn't start from a global level" and "you shouldn't use invisible black magic," I hold a different view.
Whether it's Entity Framework or ASP.NET Core, the framework itself does many similar things. When someone criticizes that "modifying SQL from an interceptor makes the generated SQL different from what the developer expects," do developers really think they can fully grasp every single SQL statement generated by EF?
In fact, most people might only understand a portion of the mechanisms (myself included), and some might not even know what the framework is helping to handle behind the scenes (e.g., the impact of Parameter Sniffing).
For example:
ASP.NET Core Case Conversion: In the early days of ASP.NET Framework, JSON property names matched C# DTOs by default (PascalCase), which felt awkward for front-end developers accustomed to camelCase. Therefore, ASP.NET Core changed the default to force conversion to camelCase. This is a classic example of framework-level global processing.
Model Validation: In the past, we had to manually check
ModelState.IsValidin Actions, but later, ASP.NET Core Web API even started intercepting validation-failed requests by default and returning 400, without the developer writing a single line of code.
So, in many cases, performing global processing from the framework is actually the right choice. The issue is not "whether it is global," but which approach better reduces the Cognitive Load of the development team.
The Trade-off Between Cognitive Load and Default Behavior
AsNoTracking() is also a great example. The data retrieved has better performance because it lacks some tracking information. Theoretically, we would suggest that teams always add AsNoTracking() to "queries that do not require modification."
But in practice, we easily see two types of chaos:
- Should have added it but didn't: Leading to failure when trying to Update later (because it was tracked).
- Shouldn't have added it but did: Leading to an inability to modify, and people who don't understand the situation manually changing
EntityStateto force modification, resulting in even more chaotic code.
Instead of relying on the developer's "self-discipline" or "memory," it is better to split the architecture: separate Query Service and Repository. Any data coming from the Query Service has AsNoTracking() by default. This way, when the team uses the Query Service, they don't have to think about "whether to add NoTracking" because the architecture has already made the safest default decision for you.
This is the point I want to emphasize: Look at which scenario is more common, which is easier to distinguish, and which carries the lowest risk if judged incorrectly.
We can evaluate "whether to perform global processing" from three dimensions:
Which scenario is more common? This depends on the project characteristics. If it's a reporting system, perhaps 90% of queries should have
NOLOCK; if it's a transaction system, perhaps 90% should not.Which is easier to distinguish? Sometimes, having a global default enabled makes it easier to distinguish.
- Default: No Hint Developers have to judge "Can this query be dirty-read?" every time they write a query. This means they must deeply understand the business logic, which has a higher judgment cost.
- Default: Hint Added Developers only need to judge "Can this query absolutely NOT be dirty-read? (e.g., deducting inventory)". Relatively speaking, these high-risk operations are easier to identify, and the threshold for misjudgment is lower.
Which carries the lowest risk if judged incorrectly?
- Default: No Hint: Usually the lowest risk (everyone agrees on this).
- Default: Hint Added + Misjudgment: Data inconsistency, transaction errors (severe).
- Default: No Hint + Misjudgment: Query slows down, potential deadlock (usually tolerable).
Although from point 3, "Default: No Hint" is the lowest risk and the most conservative choice in most cases, if combined with point 1 (business characteristics) and point 2 (difficulty of identification), some projects might find that "default global processing" actually significantly reduces the team's development burden and mental cost.
Regarding whether to use "explicit declaration" or "implicit black magic," the point is not the technical means itself, but whether it can effectively reduce the team's cognitive load and align with the project's risk tolerance. Discussing Anti-Patterns outside of specific contexts is often just another form of dogmatism.
Change Log
- 2026-02-05 Initial document created.
